{% block head %}
1. As a citizen, I want to have easy access to public transportation.
From a citizen/resident perspective, I want good access to public transportation. I seek to understand which areas of the city provide the easiest access to transportation infrastructure such as bus, train and tram stops.
2. As city planners, we want to make sure the supply of transportation infrastructure will meet demand.
From a leadership and strategic perspective as a council, we seek to invest in initiatives that effectively make it easier to access public transportation. We want to prioritize investments in areas with high demand and in areas with difficult access to transport.
Using the power of data aggregation, we can combine Melbourne Open Data transport datasets such as bus, train and tram stops with other data like dwellings and begin to observe, analyze and report on geographical patterns between these datasets. In addition, we can apply spatial analysis techniques, such as distance analysis, to evaluate the accessibility of public transportation.
We can ask questions such as:
Goals for the exploratory data analysis:
This use case will involve some exploratory data analysis and aggregation of open datasets. A source of inspiration for this use case comes from the Centre for Urban Research at RMIT University. They published a critical policy brief in 2020 which articulated the Transport Impacts of New High Density Housing: “Approvals for high-density housing in Australia have risen steeply, with the number of new apartments constructed each year tripling since 2009. In the last five years, apartments accounted for around 40% of all residential building approvals in Melbourne. This has significant implications for transport and urban planning, including effects on road congestion, car parking, and overcrowding on public transport.”
This use case and exploratory data analysis project can support the City of Melbourne in the following ways:
Support for the Melbourne ‘Transport Strategy 2030’ strategic vision and goals
Influence the creation of a ‘Transport access indicator’ to monitor progress on transport accessibility in Melbourne districts
Support further discussion between City of Melbourne and Victorian transport partner agencies to improve transport accessibility programs
Create simple map visualizations for those data to contextualize their extent.
Calculate the number of dwellings in each region. We consider 3 approaches to define the regions:
Grid of square cells
CLUE blocks
CLUE small areas
Explore how the total number of dwellings will change in each district over the next 20 years using dwellings forecast data.
Explore the accessibility from dwellings to bus, train and tram stops. That includes calculating:
Distance from dwellings to stops
Walking time from dwellings to stops
Number of stops in a 300 meter radius of the dwellings
Aggregate the calculated accessibility information by grid cell and by CLUE small area. Then explore the correlation between the number of dwellings and access to transport infrastructure.
Dataset list:
To begin the analysis we first import the necessary libraries to support our exploratory data analysis using Melbourne Open data.
The following are core packages required for this exercise:
plotly // An interactive, open-source, and browser-based graphing library. It offers Python-based charting, powered by plotly.js.
geopandas // An open source project to make working with geospatial data in Python easier. GeoPandas extends the datatypes used by pandas to allow spatial operations on geometric types.
###################################################################
# Libraries used for this use case and exploratory data analysis
###################################################################
import time
import warnings
from datetime import datetime
import folium
import geopandas as gpd
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import plotly.express as px
import plotly.graph_objs as go
import pyproj
import requests
import shapely.geometry
from shapely.geometry import box
from shapely.geometry import shape
from tqdm import tqdm
warnings.filterwarnings("ignore")
Showing the map of Melbourne
Before starting the analysis, let's take a look at the surroundings of the City of Melbourne, which is our region of interest in this analysis.
m = folium.Map([-37.8,145],zoom_start=12)
m
In this section, we will open the datasets necessary to perform our analysis, and also plot some of these datasets in order to get a glimpse of their content and spatial extent.
Let's open and visualize the dwellings data. This dataset contains the location of, and information about, properties and dwellings.
groupbyfields = ['CLUE small area','Block ID']
aggregatebyfields = {'Dwelling number': ["sum"],
'y coordinate':["first"],
'x coordinate':["first"]}
#dwellings_gdf = gpd.read_file('https://data.melbourne.vic.gov.au/explore/dataset/residential-dwellings/download/?format=geojson&timezone=America/Argentina/Buenos_Aires&lang=en')
dwellings_gdf = pd.read_csv('Residential_dwellings_2020.csv')
dwellings_gdf = gpd.GeoDataFrame(dwellings_gdf, geometry = gpd.points_from_xy(dwellings_gdf['x coordinate'], dwellings_gdf['y coordinate']))
#dwellings_gdf = dwellings_gdf[dwellings_gdf['census_year'] == '2020']
dwellings_gdf = dwellings_gdf.set_crs(pyproj.CRS.from_user_input('EPSG:4326'))
dwellings_gdf = dwellings_gdf.to_crs( pyproj.CRS.from_user_input('EPSG:28355'))
dwellings_gdf.head(3)
| | Census year | Block ID | Property ID | Base property ID | Building address | CLUE small area | Dwelling type | Dwelling number | x coordinate | y coordinate | geometry |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2020 | 1 | 611394 | 611394 | 545-557 Flinders Street MELBOURNE VIC 3000 | Melbourne (CBD) | Residential Apartments | 196 | 144.95651 | -37.82098 | POINT (320142.225 5812080.098) |
| 1 | 2020 | 1 | 611395 | 611395 | 561-581 Flinders Street MELBOURNE VIC 3000 | Melbourne (CBD) | Residential Apartments | 189 | 144.95591 | -37.82109 | POINT (320089.677 5812066.736) |
| 2 | 2020 | 11 | 103957 | 103957 | 517-537 Flinders Lane MELBOURNE VIC 3000 | Melbourne (CBD) | Residential Apartments | 26 | 144.95666 | -37.81987 | POINT (320152.734 5812203.559) |
Plotting dwellings data for the City of Melbourne, where the size of each bubble denotes the number of dwellings. To achieve that, we group buildings by block ID and CLUE small area and aggregate the 'Dwelling number' totals.
dwellingsByLocn = pd.DataFrame(dwellings_gdf.groupby(groupbyfields, as_index=False).agg(aggregatebyfields))
dwellingsByLocn.columns = dwellingsByLocn.columns.get_level_values(0)
dwellingsByLocn
fig = px.scatter_mapbox(dwellingsByLocn, lat="y coordinate", lon="x coordinate",
hover_name="CLUE small area",
hover_data=["CLUE small area", "Dwelling number"],
color_continuous_scale=px.colors.sequential.Plasma, zoom=12, height=600,
color="Dwelling number",
size = "Dwelling number"
)
fig.update_layout(mapbox_style="open-street-map")
fig.update_layout(margin={"r":0,"t":0,"l":0,"b":0})
fig.show()
Let's open the dwellings forecast data. This dataset has information about the expected number of dwellings by district for every year, ranging from 2020 to 2040.
#Dwelling forecast
dwellings_forecast_df = pd.read_csv('https://data.melbourne.vic.gov.au/explore/dataset/city-of-melbourne-dwellings-and-household-forecasts-by-small-area-2020-2040/download/?format=csv&timezone=America/Argentina/Buenos_Aires&lang=en&use_labels_for_header=true&csv_separator=%2C')
dwellings_forecast_df['Value'] = dwellings_forecast_df['Value'].astype('int')
dwellings_forecast_df['Year'] = dwellings_forecast_df['Year'].astype('int')
dwellings_forecast_df = dwellings_forecast_df[dwellings_forecast_df['Geography']!='City of Melbourne']
#Summing only the 'Value' column, so non-numeric columns are not aggregated
dwellings_forecast_df = dwellings_forecast_df.groupby(['Geography','Year'])['Value'].sum().reset_index()
dwellings_forecast_df.head(3)
| | Geography | Year | Value |
|---|---|---|---|
| 0 | Carlton | 2021 | 35519 |
| 1 | Carlton | 2022 | 35821 |
| 2 | Carlton | 2023 | 36854 |
Let's open and visualize CLUE blocks and Small areas data.
There are 13 standard predefined CLUE small areas within the City of Melbourne. These small areas are named after official place names and suburbs but are different from these places and suburbs.
The CLUE blocks refer to the census area, which is divided into 606 city blocks, each of which is identified by a unique block number. These blocks are primarily bounded by main roads and also take into account similar space use.
#Creating clue_blocks geodataframe
clue_blocks_gdf = gpd.read_file('https://data.melbourne.vic.gov.au/explore/dataset/blocks-for-census-of-land-use-and-employment-clue/download/?format=geojson&timezone=America/Argentina/Buenos_Aires&lang=en')
clue_blocks_gdf = clue_blocks_gdf.set_crs(pyproj.CRS.from_user_input('EPSG:4326'))
clue_blocks_gdf = clue_blocks_gdf.to_crs( pyproj.CRS.from_user_input('EPSG:28355'))
#Creating clue small areas geodataframe
clue_small_areas_gdf = gpd.read_file('https://data.melbourne.vic.gov.au/explore/dataset/small-areas-for-census-of-land-use-and-employment-clue/download/?format=geojson&timezone=America/Argentina/Buenos_Aires&lang=en')
clue_small_areas_gdf = clue_small_areas_gdf.set_crs(pyproj.CRS.from_user_input('EPSG:4326'))
clue_small_areas_gdf = clue_small_areas_gdf.to_crs( pyproj.CRS.from_user_input('EPSG:28355'))
clue_small_areas_gdf.head(3)
| | featurenam | shape_area | shape_len | geometry |
|---|---|---|---|---|
| 0 | Parkville | 4050997.2362 | 9224.56939673 | MULTIPOLYGON (((318639.485 5815751.194, 318612... |
| 1 | Southbank | 1596010.33174 | 6012.37723915 | MULTIPOLYGON (((320102.212 5811843.555, 320282... |
| 2 | South Yarra | 1057773.39715 | 5424.13644582 | MULTIPOLYGON (((322711.120 5809393.959, 322619... |
Let's open and visualize bus stops data. This dataset has the location of bus stops in the city of Melbourne
bus_stops_gdf = gpd.read_file('https://data.melbourne.vic.gov.au/explore/dataset/bus-stops/download/?format=geojson&timezone=America/Argentina/Buenos_Aires&lang=en')
bus_stops_gdf['Longitude'] = bus_stops_gdf.geometry.x
bus_stops_gdf['Latitude'] = bus_stops_gdf.geometry.y
bus_stops_gdf = bus_stops_gdf.set_crs(pyproj.CRS.from_user_input('EPSG:4326'))
#Reprojecting to planar coordinates
bus_stops_gdf = bus_stops_gdf.to_crs( pyproj.CRS.from_user_input('EPSG:28355'))
bus_stops_gdf.head(3)
| | objectid | prop_id | model_desc | addresspt | mcc_id | roadseg_id | asset_clas | model_no | addressp_1 | asset_type | str_id | descriptio | addresspt1 | geometry | Longitude | Latitude |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 355 | 0 | Sign - Public Transport 1 Panel | 570648 | 1235255 | 21673 | Signage | P.16 | 357 | Sign - Public Transport | 1235255 | Sign - Public Transport 1 Panel Bus Stop Type 13 | 76.819824 | POINT (317977.230 5813935.160) | 144.932393 | -37.803842 |
| 1 | 600 | 0 | Sign - Public Transport 1 Panel | 548056 | 1231226 | 20184 | Signage | P.16 | 83 | Sign - Public Transport | 1231226 | Sign - Public Transport 1 Panel Bus Stop Type 8 | 21.561304 | POINT (320275.850 5812692.850) | 144.958179 | -37.815487 |
| 2 | 640 | 0 | Sign - Public Transport 1 Panel | 543382 | 1237092 | 20186 | Signage | P.16 | 207 | Sign - Public Transport | 1237092 | Sign - Public Transport 1 Panel Bus Stop Type 8 | 42.177187 | POINT (320192.240 5812907.290) | 144.957283 | -37.813539 |
fig = px.scatter_mapbox(bus_stops_gdf, lat="Latitude", lon="Longitude",
hover_name="objectid",
color_continuous_scale=px.colors.sequential.Plasma, zoom=12, height=600,
)
fig.update_layout(mapbox_style="open-street-map")
fig.update_layout(margin={"r":0,"t":0,"l":0,"b":0})
fig.show()
Let's open and visualize train stops data. This dataset has the information on the location of train stops in the city of Melbourne
train_stops_gdf = gpd.read_file('https://data.melbourne.vic.gov.au/explore/dataset/metro-train-stations-with-accessibility-information/download/?format=geojson&timezone=America/Argentina/Buenos_Aires&lang=en')
train_stops_gdf['LONGITUDE'] = train_stops_gdf.geometry.x
train_stops_gdf['LATITUDE'] = train_stops_gdf.geometry.y
train_stops_gdf = train_stops_gdf.to_crs(pyproj.CRS.from_user_input('EPSG:28355'))
train_stops_gdf
| | he_loop | pids | station | lift | geometry | LONGITUDE | LATITUDE |
|---|---|---|---|---|---|---|---|
| 0 | No | Dot Matrix | Armadale | No | POINT (325759.223 5808265.019) | 145.019374 | -37.856435 |
| 1 | No | No | Aspendale | No | POINT (333415.085 5789484.185) | 145.102007 | -38.027045 |
| 2 | No | Dot Matrix | Belgrave | No | POINT (355415.969 5802995.343) | 145.355291 | -37.909099 |
| 3 | No | No | Bittern | No | POINT (340740.650 5755138.923) | 145.177739 | -38.337747 |
| 4 | No | Dot Matrix | Blackburn | Yes | POINT (337189.340 5812526.638) | 145.150197 | -37.820158 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 214 | No | Dot Matrix | Windsor | No | POINT (323325.879 5808280.293) | 144.991733 | -37.855829 |
| 215 | No | No | Yarraman | No | POINT (341187.095 5795067.192) | 145.191752 | -37.978147 |
| 216 | No | LCD | Yarraville | No | POINT (314272.541 5812523.237) | 144.889975 | -37.815813 |
| 217 | Yes | Dot Matrix | Coolaroo | Yes | POINT (317067.770 5829773.165) | 144.926053 | -37.661001 |
| 218 | Yes | Dot Matrix | Lynbrook | No | POINT (346415.148 5786377.210) | 145.249391 | -38.057332 |
219 rows × 7 columns
fig = px.scatter_mapbox(train_stops_gdf, lat="LATITUDE", lon="LONGITUDE",
hover_name="station",
zoom=11, height=600,
color_discrete_sequence = ['#FF0000'])
fig.update_layout(mapbox_style="open-street-map")
fig.update_layout(margin={"r":0,"t":0,"l":0,"b":0})
Let's open and visualize tram stops data. This dataset has the location of tram stops in the city of Melbourne
tram_stops_gdf = gpd.read_file('metro tram stop/PTV_METRO_TRAM_STOP.shp')
tram_stops_gdf
| | STOP_ID | STOP_NAME | LATITUDE | LONGITUDE | TICKETZONE | ROUTEUSSP | geometry |
|---|---|---|---|---|---|---|---|
| 0 | 18730 | 134-Merribell Ave/Nicholson St (Coburg) | -37.744359 | 144.977728 | 1 | 1 | POINT (321825.986 5820623.023) |
| 1 | 18732 | 44-Deepdene Park/Whitehorse Rd (Balwyn) | -37.811375 | 145.068671 | 1 | 109 | POINT (329993.016 5813355.988) |
| 2 | 18733 | 45-Hardwicke St/Whitehorse Rd (Balwyn) | -37.811750 | 145.071785 | 1 | 109 | POINT (330268.014 5813320.036) |
| 3 | 18734 | 46-Balwyn Cinema/Whitehorse Rd (Balwyn) | -37.812242 | 145.075930 | 1 | 109 | POINT (330634.010 5813272.971) |
| 4 | 18735 | 47-Balwyn Rd/Whitehorse Rd (Balwyn) | -37.812919 | 145.081524 | 1,2 | 109 | POINT (331128.013 5813207.978) |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 1660 | 6037 | 34-Bent St/High St (Northcote) | -37.767614 | 144.999096 | 1 | 86 | POINT (323764.041 5818083.024) |
| 1661 | 6038 | 33-Arthurton Rd/High St (Northcote) | -37.769413 | 144.998900 | 1 | 86 | POINT (323751.030 5817883.037) |
| 1662 | 6039 | 32-Mitchell St/High St (Northcote) | -37.771138 | 144.998569 | 1 | 86 | POINT (323725.986 5817690.996) |
| 1663 | 6040 | 31-Northcote Town Hall/High St (Northcote) | -37.774712 | 144.997837 | 1 | 86 | POINT (323669.980 5817293.028) |
| 1664 | 6041 | 30-Clarke St/High St (Northcote) | -37.776690 | 144.997534 | 1 | 86 | POINT (323648.014 5817072.966) |
1665 rows × 7 columns
tram_df = tram_stops_gdf[['STOP_NAME','LATITUDE','LONGITUDE']]
fig = px.scatter_mapbox(tram_df, lat="LATITUDE", lon="LONGITUDE",
hover_name="STOP_NAME",
color_continuous_scale="rdgy", zoom=12, height=600,
color_discrete_sequence = ['#FF0000'],
)
fig.update_layout(mapbox_style="open-street-map")
fig.update_layout(margin={"r":0,"t":0,"l":0,"b":0})
fig.show()
In this section we will visualize and analyse the distribution of dwellings over the City of Melbourne.
To explore this data, we will aggregate it by 3 different limits: a grid of square cells, CLUE blocks and CLUE small areas.
Here we define the functions that allow us to aggregate the data by these 3 limits.
The 'create_grid' function creates a grid GeoDataFrame using the desired number of cells.
The 'summarize_within' function aggregates spatial data using spatial operations.
def create_grid(gdf, n_cells=15):
    '''
    Creates a regular grid over the extent of gdf
    Returns:
        A GeoDataFrame with the cells geometries
    '''
    # total area for the grid
    xmin, ymin, xmax, ymax = gdf.total_bounds
    # how many cells across and down
    cell_size = (xmax-xmin)/n_cells
    # projection of the grid
    #crs = "+proj=sinu +lon_0=0 +x_0=0 +y_0=0 +a=6371007.181 +b=6371007.181 +units=m +no_defs"
    crs = gdf.crs
    # create the cells in a loop
    grid_cells = []
    for x0 in np.arange(xmin, xmax+cell_size, cell_size):
        for y0 in np.arange(ymin, ymax+cell_size, cell_size):
            # bounds
            x1 = x0-cell_size
            y1 = y0+cell_size
            grid_cells.append(shapely.geometry.box(x0, y0, x1, y1))
    grid = gpd.GeoDataFrame(grid_cells, columns=['geometry'],
                            crs=crs)
    return grid
def summarize_within(input_gdf, input_summary_gdf, in_fields, out_fields=None, aggfunc='mean'):
    '''
    Overlays a polygon layer with another layer to calculate attribute field statistics about those features (input_summary_gdf) within the polygons (input_gdf).
    Parameters:
        input_gdf: GeoDataFrame of the polygons by which features will be summarized.
        input_summary_gdf: GeoDataFrame of features that will be summarized
        in_fields: names of the fields (in input_summary_gdf) that will be summarized
        out_fields: names that the fields will have after they're summarized
        aggfunc: function that will be used to summarize
    Returns:
        A GeoDataFrame with 'input_gdf' polygons and the attributes of 'input_summary_gdf' summarized by each polygon.
    '''
    input_gdf = input_gdf.copy()
    input_summary_gdf = input_summary_gdf.copy()
    if out_fields is None:
        out_fields = in_fields
    #Joins the features with the input polygons. A new column "index_right" is created. It indicates which polygon each feature is within.
    merged = gpd.sjoin(input_summary_gdf, input_gdf, how='left')
    #Now let's aggregate the features within each polygon
    dissolve = merged.dissolve(by="index_right", aggfunc=aggfunc) #Dissolve (works like groupby) by the polygon index
    for in_field, out_field in zip(in_fields, out_fields):
        input_gdf.loc[dissolve.index, out_field] = dissolve[in_field].values #Writing the summarized values into the polygons gdf
    return input_gdf.round(2)
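For square cells specifically, summing a field per cell amounts to binning each point by its cell index. The sketch below illustrates that idea in plain Python on made-up planar coordinates. The function `bin_points_to_grid` and the sample values are hypothetical and are not part of the notebook, which performs this step with a GeoPandas spatial join.

```python
import math
from collections import defaultdict

def bin_points_to_grid(points, weights, xmin, ymin, cell_size):
    """Sum `weights` per square grid cell.

    points: iterable of (x, y) tuples in a planar CRS (metres).
    Returns a dict mapping (column, row) -> summed weight.
    """
    totals = defaultdict(float)
    for (x, y), weight in zip(points, weights):
        # Integer cell index along each axis, counted from the grid origin
        col = math.floor((x - xmin) / cell_size)
        row = math.floor((y - ymin) / cell_size)
        totals[(col, row)] += weight
    return dict(totals)

# Hypothetical dwellings: planar coordinates in metres and dwelling counts
sample_points = [(10.0, 10.0), (40.0, 20.0), (120.0, 30.0), (130.0, 160.0)]
sample_dwellings = [5, 7, 3, 10]
cell_totals = bin_points_to_grid(sample_points, sample_dwellings,
                                 xmin=0.0, ymin=0.0, cell_size=100.0)
print(cell_totals)
```

The spatial join remains the more general choice because it also works for irregular polygons such as CLUE blocks and small areas, where a simple index calculation is not possible.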
Below, we create a square grid using the functions described previously. Then we calculate the total number of dwellings in each cell of the grid, and plot this on the map.
#Creating the grid GeoDataFrame
grids = create_grid(dwellings_gdf)
dwellings_gdf['n_properties']=1 #initialize n_properties to 1, so we can sum how many properties in each grid cell
#Summarizing Dwelling GeodataFrame by sum
summarized_grid = summarize_within(
grids,
dwellings_gdf,
in_fields= ['n_properties', 'Dwelling number'],
out_fields= ['Properties', 'Dwelling number'],
aggfunc='sum')
#Reprojecting to geographic coordinates (EPSG:4326) for plotting
summarized_grid = summarized_grid.to_crs(pyproj.CRS.from_epsg(4326))
x,y = box(*summarized_grid.total_bounds).centroid.xy
fig = px.choropleth_mapbox(summarized_grid, geojson=summarized_grid.geometry, locations=summarized_grid.index, color="Dwelling number", center={"lat": y[0], "lon": x[0]},
mapbox_style="open-street-map", zoom=11, opacity=0.7, color_continuous_scale=px.colors.sequential.YlOrRd)
fig.show()
Below we calculate the total number of dwellings on each clue block, and plot this on the map.
dwellings_gdf['n_properties']=1 #initialize n_properties to 1, so we can sum how many properties in each grid cell
#Summarizing Dwelling GeodataFrame by sum
summarized_clue_blocks = summarize_within(
clue_blocks_gdf,
dwellings_gdf,
in_fields= ['n_properties', 'Dwelling number'],
out_fields= ['Properties', 'Dwelling number'],
aggfunc='sum')
summarized_clue_blocks = summarized_clue_blocks.set_index('block_id')
summarized_clue_blocks = summarized_clue_blocks.to_crs(pyproj.CRS.from_epsg(4326))
x,y = box(*summarized_clue_blocks.total_bounds).centroid.xy
fig = px.choropleth_mapbox(summarized_clue_blocks, geojson=summarized_clue_blocks.geometry, locations=summarized_clue_blocks.index, color="Dwelling number", center={"lat": y[0], "lon": x[0]},
mapbox_style="open-street-map", zoom=11, opacity=0.7, color_continuous_scale=px.colors.sequential.YlOrRd)
fig.show()
Below we calculate the total number of dwellings on each CLUE Small Area, and plot this on the map.
dwellings_gdf['n_properties']=1 #initialize n_properties to 1, so we can sum how many properties in each grid cell
#Summarizing Dwelling GeodataFrame by sum
summarized_clue_small_areas = summarize_within(
clue_small_areas_gdf,
dwellings_gdf,
in_fields= ['n_properties', 'Dwelling number'],
out_fields= ['Properties', 'Dwelling number'],
aggfunc='sum')
summarized_clue_small_areas = summarized_clue_small_areas.set_index('featurenam')
summarized_clue_small_areas = summarized_clue_small_areas.to_crs(pyproj.CRS.from_epsg(4326))
x,y = box(*summarized_clue_small_areas.total_bounds).centroid.xy
fig = px.choropleth_mapbox(summarized_clue_small_areas, geojson=summarized_clue_small_areas.geometry, locations=summarized_clue_small_areas.index, color="Dwelling number", center={"lat": y[0], "lon": x[0]},
mapbox_style="open-street-map", zoom=11, opacity=0.7, color_continuous_scale=px.colors.sequential.YlOrRd)
fig.show()
Let's perform some analysis on the dwelling forecast between 2020 and 2040. This is important because we need to know not only which areas are impacted by the lack of public transport today, but also which areas might be affected in the future if the current transport scenario stays the same
In the cell below, we create 2 DataFrames from the original Dwellings forecast DataFrame. One of the created dataframes contains the records of Dwellings forecast for 2020 and the other one contains Dwellings forecast for 2040
dwellings_forecast_df_start = pd.DataFrame([{'small_area':group, 'value':group_df.iloc[0]['Value']} for group,group_df in dwellings_forecast_df.groupby('Geography')])
dwellings_forecast_df_end = pd.DataFrame([{'small_area':group, 'value':group_df.iloc[-1]['Value']} for group,group_df in dwellings_forecast_df.groupby('Geography')])
Below we create an interactive bar chart to visualize how the number of dwellings can potentially change between 2020 and 2040. You can click on the drop-down to choose between 'All', '2020' or '2040' Dwellings forecast.
fig = go.Figure()
fig.add_trace(
go.Bar(x = dwellings_forecast_df_start['small_area'], y=dwellings_forecast_df_start['value'], name='2020', marker = {'color':'#c49279'})
)
fig.add_trace(
go.Bar(x = dwellings_forecast_df_end['small_area'], y=dwellings_forecast_df_end['value'], name='2040', marker = {'color':'#6f7ca8'})
)
fig.update_layout(
updatemenus=[go.layout.Updatemenu(
active=0,
buttons=list(
[dict(label = 'All',
method = 'update',
args = [{'visible': [True, True]},
{'title': 'Number of dwellings from 2020 to 2040',
'showlegend':True}]),
dict(label = '2020',
method = 'update',
args = [{'visible': [True, False]},
{'title': 'Number of dwellings in 2020',
'showlegend':True}]),
dict(label = '2040',
method = 'update',
args = [{'visible': [False, True]}, # one entry per trace: the index of True aligns with the indices of plot traces
{'title': 'Dwellings forecast for 2040',
'showlegend':True}]),
])
)
])
fig.update_layout(title_text = 'Number of dwellings from 2020 to 2040', title_x=0.5)
fig.show()
Let's calculate the rate of change in the number of dwellings between 2020 and 2040. That rate is given by:
$rate = \left(\frac{dwellings_{2040}-dwellings_{2020}}{dwellings_{2020}}\right) \times 100$
dwelling_growth_df = []
for group, group_df in dwellings_forecast_df.groupby('Geography'):
    today_dwellings = group_df.iloc[0]['Value']
    future_dwellings = group_df.iloc[-1]['Value']
    rate = ((future_dwellings - today_dwellings)/today_dwellings)*100
    if group!='Port Melbourne':
        dwelling_growth_df.append({'small_area':group, 'dwelling_growth_rate':rate})
dwelling_growth_df = pd.DataFrame(dwelling_growth_df)
fig, ax = plt.subplots(figsize=(14,4))
ax.hlines(0,0,len(dwelling_growth_df), linestyles='--', colors='black')
ax.plot(dwelling_growth_df['small_area'], dwelling_growth_df['dwelling_growth_rate'], c='r')
plt.xticks(rotation = 45)
ax.set_title('Change rate of number of dwellings between 2020 and 2040')
ax.set_ylabel('Rate (%)')
plt.show()
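As a quick sanity check of the growth-rate calculation, the sketch below applies the same percentage-change formula to made-up numbers. The helper `growth_rate` and the values are hypothetical and are not taken from the forecast dataset.

```python
def growth_rate(start_value, end_value):
    """Percentage change from start_value to end_value."""
    if start_value == 0:
        raise ValueError("start_value must be non-zero")
    return (end_value - start_value) / start_value * 100

# Hypothetical district growing from 20,000 to 30,000 dwellings
example_rate = growth_rate(20_000, 30_000)
print(f"{example_rate:.1f}%")  # 50.0%
```

A negative result would indicate that the forecast number of dwellings falls over the period, which is why the plot includes a dashed reference line at zero.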
In this section we are interested in exploring each region's accessibility to transportation.
To achieve this, below we define some functions that will help us calculate 2 important pieces of information:
The Euclidean (straight line) distance between each property and the closest stop (bus, train and tram)
How many stops (bus, tram and train) exist within a 300 meter radius of each property
def get_k_closest_geom_distance(gdf1, gdf2, K):
    '''
    Finds the mean distance between each geometry in gdf1 and the K nearest geometries of gdf2
    '''
    distances = []
    for point in tqdm(gdf1['geometry']):
        distances.append(gdf2['geometry'].distance(point).sort_values()[:K].mean())
    return distances
def get_points_in_radius(gdf1, gdf2, R):
    '''
    Finds which points from gdf2 are within R meters of each point in gdf1
    Returns:
        A dictionary with 2 lists.
        One of the lists shows, for each geometry (in gdf1), the number of geometries (in gdf2) that are within R meters.
        The other list shows, for each geometry (in gdf1), the mean distance to the geometries (in gdf2) that are within R meters
    '''
    radius_geoms = gdf1['geometry'].buffer(R)
    result = {f'number_of_geoms_within_{R}m':[],f'mean_distance_to_geoms_within_{R}m':[]}
    for i, radius_geom in enumerate(tqdm(radius_geoms)):
        intersection_geoms = gdf2.intersection(radius_geom)
        valid_geoms_mask = ~intersection_geoms.is_empty
        number_of_geoms = len(intersection_geoms[valid_geoms_mask])
        if number_of_geoms>0:
            #Measure from the i-th geometry of gdf1 (the centre of the current buffer)
            mean_distance_to_geoms = np.array([gdf1['geometry'].iloc[i].distance(point) for point in intersection_geoms[valid_geoms_mask]]).mean()
        else:
            mean_distance_to_geoms = np.nan
        result[f'number_of_geoms_within_{R}m'].append(number_of_geoms)
        result[f'mean_distance_to_geoms_within_{R}m'].append(mean_distance_to_geoms)
    return result
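The two measures computed by these functions can be illustrated on toy planar coordinates without GeoPandas. The following is a minimal sketch under that assumption. The helper names and the sample points are hypothetical, and the radius test uses a plain distance comparison rather than the buffer-and-intersect approach used in the notebook.

```python
import math

def nearest_stop_distance(dwelling, stops):
    """Euclidean distance from one dwelling to its closest stop."""
    return min(math.dist(dwelling, stop) for stop in stops)

def stops_within_radius(dwelling, stops, radius):
    """Number of stops whose Euclidean distance to the dwelling is at most `radius`."""
    return sum(1 for stop in stops if math.dist(dwelling, stop) <= radius)

# Hypothetical planar coordinates in metres (as in a projected CRS such as EPSG:28355)
dwelling = (0.0, 0.0)
stops = [(30.0, 40.0), (250.0, 0.0), (0.0, 500.0)]

print(nearest_stop_distance(dwelling, stops))      # 50.0
print(stops_within_radius(dwelling, stops, 300))   # 2
```

Both calculations only make sense in a projected coordinate system with metre units, which is why the datasets are reprojected before the distances are computed.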
We also want to create a function to find the walking time from each property to the closest stop (bus, train and tram). This is important because the shortest straight line distance does not necessarily imply the shortest walking time. For that, we use the Open Source Routing Machine (OSRM), an open-source routing engine that calculates routes using the OpenStreetMap road network.
In the code below, we define the functions that will allow us to calculate the walking time
def request_manhattan_distance(p1, p2, profile, ID=None):
    '''
    profile: car,bike or foot
    '''
    if profile not in ['car','foot','bike']:
        raise Exception('profile does not exist')
    from_lon, from_lat = p1
    to_lon, to_lat = p2
    url = f'https://routing.openstreetmap.de/routed-{profile}/route/v1/driving/{from_lon},{from_lat};{to_lon},{to_lat}?overview=false'
    request = requests.get(url)
    if request.status_code==200:
        #distance = request.json()['routes'][0]['distance']
        route_info = request.json()['routes'][0]
        time.sleep(0.5)
        return {'distance':route_info['distance'], 'time':route_info['duration']/60}
    else:
        print(request.json())
        time.sleep(240)
        return {'distance':None, 'time':None}
def get_manhattan_distance(gdf1, gdf2):
    '''
    gdf1: dwellings
    gdf2: bus stops
    '''
    gdf1 = gdf1.to_crs( pyproj.CRS.from_user_input('EPSG:4326'))
    gdf2 = gdf2.to_crs( pyproj.CRS.from_user_input('EPSG:4326'))
    route_info = []
    for geom in tqdm(gdf1['geometry']):
        closest_geom = gdf2.loc[gdf2.distance(geom).sort_values()[:1].index]['geometry'].iloc[0]
        try:
            p1 = geom.x, geom.y
            p2 = closest_geom.x, closest_geom.y
            route_info.append(request_manhattan_distance(p1, p2, 'foot'))
        except IndexError:
            route_info.append({'distance':None, 'time':None})
    return pd.DataFrame(route_info)
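To make the request and response handling easier to follow without calling the service, the sketch below separates URL construction from response parsing. It reuses the URL pattern from `request_manhattan_distance`. The helper names `build_route_url` and `parse_route` and the sample payload are hypothetical, and the payload only mimics the two fields the notebook reads (`distance` in metres and `duration` in seconds).

```python
def build_route_url(p1, p2, profile):
    """Build the route URL for two (lon, lat) points, following the notebook's URL pattern."""
    if profile not in ('car', 'foot', 'bike'):
        raise ValueError('profile does not exist')
    from_lon, from_lat = p1
    to_lon, to_lat = p2
    return (f'https://routing.openstreetmap.de/routed-{profile}/route/v1/driving/'
            f'{from_lon},{from_lat};{to_lon},{to_lat}?overview=false')

def parse_route(payload):
    """Extract distance (metres) and travel time (minutes) from the first route."""
    route = payload['routes'][0]
    return {'distance': route['distance'], 'time': route['duration'] / 60}

# Hypothetical response containing only the fields used above
sample_payload = {'code': 'Ok', 'routes': [{'distance': 420.0, 'duration': 300.0}]}

# Coordinates are (longitude, latitude), in that order
url = build_route_url((144.95651, -37.82098), (144.958179, -37.815487), 'foot')
print(url)
print(parse_route(sample_payload))  # {'distance': 420.0, 'time': 5.0}
```

Keeping the parsing separate from the network call makes it possible to test the conversion from seconds to minutes offline.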
Here we use the functions defined previously to calculate the straight line distance and walking time between dwellings and stops.
public_transport_datasets = [('bus',bus_stops_gdf), ('train',train_stops_gdf), ('tram',tram_stops_gdf)]
#bus_route_info_df = get_manhattan_distance(dwellings_gdf, bus_stops_gdf)
#train_route_info_df = get_manhattan_distance(dwellings_gdf, train_stops_gdf)
#tram_route_info_df = get_manhattan_distance(dwellings_gdf, tram_stops_gdf)
dwellings_and_transport = dwellings_gdf.copy()
for gdf_name, gdf in public_transport_datasets:
    #route_info_df = get_manhattan_distance(dwellings_gdf, bus_stops_gdf)
    route_info_df = pd.read_csv(f'{gdf_name}_route_info.csv')
    distances_nearest_stop = get_k_closest_geom_distance(dwellings_gdf, gdf, K=1)
    R = 300
    distance_radius_result = get_points_in_radius(dwellings_gdf, gdf, R=R)
    stops_distances_info_df = pd.DataFrame(distance_radius_result)
    stops_distances_info_df.columns = [f'n_{gdf_name}_stops_within_{R}m', f'mean_dist_to_{gdf_name}_stops_within_{R}m']
    stops_distances_info_df[f'dist_from_nearest_1_{gdf_name}_stops'] = distances_nearest_stop
    stops_distances_info_df[f'walking_time_nearest_{gdf_name}_stop'] = route_info_df['time'].values
    stops_distances_info_df['Property ID'] = dwellings_gdf['Property ID'].values
    stops_distances_info_df['Base property ID'] = dwellings_gdf['Base property ID'].values
    for column in stops_distances_info_df.columns:
        if column not in dwellings_gdf.columns:
            dwellings_and_transport[column] = stops_distances_info_df[column]
Let's aggregate the transport access information obtained previously by the square grid that we created earlier. By doing this, we can analyze the relationship between transportation and the number of dwellings in each cell.
transport_by_grid = summarized_grid.to_crs(pyproj.CRS.from_epsg(28355))
#transport_by_grid = grids.copy()
public_transport_datasets = [('bus',bus_stops_gdf), ('train',train_stops_gdf), ('tram',tram_stops_gdf)]
for transport_type, transport_gdf in public_transport_datasets:
    #Summarizing the dwellings accessibility measures by mean
    transport_by_grid = summarize_within(
        input_gdf=transport_by_grid,
        input_summary_gdf=dwellings_and_transport,
        in_fields = [f'n_{transport_type}_stops_within_300m',f'mean_dist_to_{transport_type}_stops_within_300m',f'dist_from_nearest_1_{transport_type}_stops',f'walking_time_nearest_{transport_type}_stop'],
        out_fields= [f'Average number of {transport_type} stops within 300m of each property',f'Average distance from each property and {transport_type} stops in a 300m radius',f'Average distance between properties and the nearest {transport_type} stop',f'Average walking time to the nearest {transport_type} stop'],
        aggfunc='mean')
    #Summarizing the stops GeoDataFrame by sum
    transport_gdf[f'n_{transport_type}_stops'] = 1 #initialize stops to 1, so we can sum how many stops are in each grid cell
    transport_by_grid = summarize_within(
        input_gdf=transport_by_grid,
        input_summary_gdf=transport_gdf,
        in_fields = [f'n_{transport_type}_stops'],
        out_fields= [f'Total {transport_type} stops'],
        aggfunc='sum')
    #Cells without any stop get a total of 0 (assigned back instead of using inplace on a column)
    transport_by_grid[f'Total {transport_type} stops'] = transport_by_grid[f'Total {transport_type} stops'].fillna(0)
#transport_by_grid = transport_by_grid.set_index('featurenam')
transport_by_grid = transport_by_grid.to_crs(pyproj.CRS.from_user_input('EPSG:4326'))
transport_by_grid = transport_by_grid.dropna(subset=['Average distance between properties and the nearest bus stop'])
transport_by_grid.head(2)
| | geometry | Properties | Dwelling number | Average number of bus stops within 300m of each property | Average distance from each property and bus stops in a 300m radius | Average distance between properties and the nearest bus stop | Average walking time to the nearest bus stop | Total bus stops | Average number of train stops within 300m of each property | Average distance from each property and train stops in a 300m radius | Average distance between properties and the nearest train stop | Average walking time to the nearest train stop | Total train stops | Average number of tram stops within 300m of each property | Average distance from each property and tram stops in a 300m radius | Average distance between properties and the nearest tram stop | Average walking time to the nearest tram stop | Total tram stops |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 21 | POLYGON ((144.90633 -37.82824, 144.90646 -37.8... | 1.0 | 1.0 | 2.0 | 4265.55 | 218.73 | 4.34 | 5.0 | 0.0 | NaN | 1820.10 | 51.93 | 0.0 | 0.0 | NaN | 2729.88 | 70.1 | 0.0 |
| 29 | POLYGON ((144.90734 -37.79272, 144.90746 -37.7... | 1.0 | 3.0 | 0.0 | NaN | 667.32 | 47.29 | 0.0 | 0.0 | NaN | 525.06 | 68.44 | 0.0 | 0.0 | NaN | 697.34 | 47.3 | 0.0 |
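The idea behind summarizing point data within grid cells can be illustrated without any GIS library. The sketch below is only an illustration under simplifying assumptions: it is not the `summarize_within` function used above, the helper name `summarize_points_by_cell` and the sample values are hypothetical, and it assumes projected coordinates in metres and axis-aligned square cells identified by their (column, row) position.

```python
from collections import defaultdict
from statistics import mean


def summarize_points_by_cell(points, cell_size, field):
    """Average `field` over the points that fall in each square grid cell.

    `points` is an iterable of dicts holding projected coordinates 'x' and 'y'
    (in metres) and the attribute to average. Hypothetical helper for
    illustration only.
    """
    values_by_cell = defaultdict(list)
    for point in points:
        value = point[field]
        if value is None:  # skip missing values instead of letting them break the mean
            continue
        # Integer division assigns every point to exactly one cell
        cell = (int(point['x'] // cell_size), int(point['y'] // cell_size))
        values_by_cell[cell].append(value)
    return {cell: mean(values) for cell, values in values_by_cell.items()}


# Made-up dwellings with a distance (m) to their nearest stop
dwellings = [
    {'x': 10.0, 'y': 20.0, 'dist_to_stop': 100.0},
    {'x': 90.0, 'y': 40.0, 'dist_to_stop': 300.0},
    {'x': 250.0, 'y': 30.0, 'dist_to_stop': 50.0},
    {'x': 260.0, 'y': 35.0, 'dist_to_stop': None},
]
cell_means = summarize_points_by_cell(dwellings, cell_size=200.0, field='dist_to_stop')
```

A real spatial summary has to handle arbitrary polygons and coordinate reference systems, which is why the notebook reprojects to a metric CRS before aggregating.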
Now, let's create an interactive scatter plot between the number of dwellings and the transportation access information of each grid cell.
Using the drop-down you can choose between the 'All', 'Bus', 'Train' and 'Tram' transport types. When you select one of them, the figure title also shows the Pearson correlation coefficient between the number of dwellings and the accessibility metric of that plot.
First, the code below defines a function that draws the interactive scatter plot.
def plot_scatter(gdf, dwelling_col, bus_col, train_col, tram_col, title, xaxis='', yaxis=''):
    fig = go.Figure()
    # One trace per transport type; the order (bus, train, tram) must match the 'visible' lists below
    fig.add_trace(
        go.Scatter(x=gdf[bus_col], y=gdf[dwelling_col], name='Bus', mode='markers', marker={'color':'#c49279'})
    )
    fig.add_trace(
        go.Scatter(x=gdf[train_col], y=gdf[dwelling_col], name='Train', mode='markers', marker={'color':'#6f7ca8'})
    )
    fig.add_trace(
        go.Scatter(x=gdf[tram_col], y=gdf[dwelling_col], name='Tram', mode='markers', marker={'color':'#78a27c'})
    )
    # Pearson correlation for the three types pooled together and for each type separately
    all_dwellings = list(gdf[dwelling_col])*3
    all_stops = []
    all_stops.extend(list(gdf[bus_col]))
    all_stops.extend(list(gdf[train_col]))
    all_stops.extend(list(gdf[tram_col]))
    all_correlation = round(np.corrcoef(all_dwellings, all_stops)[0][1],2)
    bus_correlation = round(np.corrcoef(gdf[dwelling_col], gdf[bus_col])[0][1],2)
    train_correlation = round(np.corrcoef(gdf[dwelling_col], gdf[train_col])[0][1],2)
    tram_correlation = round(np.corrcoef(gdf[dwelling_col], gdf[tram_col])[0][1],2)
    # Drop-down: each button toggles trace visibility and puts the matching correlation in the title
    fig.update_layout(
        updatemenus=[go.layout.Updatemenu(
            active=0,
            buttons=list(
                [dict(label = 'All',
                      method = 'update',
                      args = [{'visible': [True, True, True]},
                              {'title': title + f' (correlation:{all_correlation})',
                               'showlegend':True}]),
                 dict(label = 'Bus',
                      method = 'update',
                      args = [{'visible': [True, False, False]},
                              {'title': title + f' (correlation:{bus_correlation})',
                               'showlegend':True}]),
                 dict(label = 'Train',
                      method = 'update',
                      args = [{'visible': [False, True, False]}, # the index of True aligns with the indices of plot traces
                              {'title': title + f' (correlation:{train_correlation})',
                               'showlegend':True}]),
                 dict(label = 'Tram',
                      method = 'update',
                      args = [{'visible': [False, False, True]}, # the index of True aligns with the indices of plot traces
                              {'title': title + f' (correlation:{tram_correlation})',
                               'showlegend':True}]),
                ])
            )
        ])
    fig.update_layout(title_text = title, title_x=0.5, xaxis_title=xaxis, yaxis_title=yaxis)
    return fig
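The coefficient shown in the title is Pearson's r, which `np.corrcoef` computes from the paired columns. One caveat is that `np.corrcoef` returns `nan` as soon as either input contains a `NaN`. As a minimal sketch of the formula, assuming we simply want to skip incomplete pairs, the hypothetical helper `pearson_r` below (not part of the notebook) computes the same statistic in plain Python; the sample values are made up.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation of paired values, skipping pairs with a missing value."""
    pairs = [(x, y) for x, y in zip(xs, ys)
             if x is not None and y is not None
             and not math.isnan(x) and not math.isnan(y)]
    n = len(pairs)
    if n < 2:
        return float('nan')
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    # Sums of cross-products and squared deviations (the 1/n factors cancel out)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        return float('nan')  # correlation is undefined for a constant column
    return cov / math.sqrt(var_x * var_y)


# Made-up example: cells with more dwellings sit closer to a stop
dwellings_per_cell = [1, 3, 10, 40]
distance_to_stop = [700.0, 650.0, float('nan'), 150.0]
r = pearson_r(dwellings_per_cell, distance_to_stop)
```

When reading the plots, keep the sign convention in mind: for distance and walking time, a negative coefficient means that cells with more dwellings are closer to stops, while for the number of stops within 300 m a positive coefficient indicates better access where there are more dwellings.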
Scatter plot of the average distance between properties and the closest stop (bus, train or tram) in each grid cell
fig = plot_scatter(
transport_by_grid,
'Dwelling number',
'Average distance between properties and the nearest bus stop',
'Average distance between properties and the nearest train stop',
'Average distance between properties and the nearest tram stop',
'Average distance to the closest stop' ,
xaxis='Distance (m)',
yaxis='Dwellings'
)
fig.show()
Scatter plot of the average walking time between properties and the closest stop (bus, train or tram) in each grid cell
fig = plot_scatter(
transport_by_grid,
'Dwelling number',
'Average walking time to the nearest bus stop',
'Average walking time to the nearest train stop',
'Average walking time to the nearest tram stop',
'Average walking time to the nearest stop' ,
xaxis='walking time (min)',
yaxis='Dwellings'
)
fig.show()
Scatter plot of the average number of stops (bus, train or tram) within 300 meters of properties in each grid cell
fig = plot_scatter(
transport_by_grid,
'Dwelling number',
'Average number of bus stops within 300m of each property',
'Average number of train stops within 300m of each property',
'Average number of tram stops within 300m of each property',
'Average number of stops within 300m of each property'
)
fig.show()
Let's aggregate the transportation accessibility information by CLUE small area. This way we can explore how each district performs.
transport_by_clue_small_area = clue_small_areas_gdf.copy()
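# Assumes the tram stops GeoDataFrame is named tram_stops_gdf, following bus_stops_gdf and train_stops_gdf
public_transport_datasets = [('bus',bus_stops_gdf), ('train',train_stops_gdf), ('tram',tram_stops_gdf)]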
for transport_type, transport_gdf in public_transport_datasets:
    # Summarize the dwelling-level access metrics by mean within each CLUE small area
    transport_by_clue_small_area = summarize_within(
        input_gdf=transport_by_clue_small_area,
        input_summary_gdf=dwellings_and_transport,
        in_fields = [f'n_{transport_type}_stops_within_300m',f'mean_dist_to_{transport_type}_stops_within_300m',f'dist_from_nearest_1_{transport_type}_stops',f'walking_time_nearest_{transport_type}_stop'],
        out_fields= [f'Average number of {transport_type} stops within 300m of each property',f'Average distance from each property and {transport_type} stops in a 300m radius',f'Average distance between properties and the nearest {transport_type} stop',f'Average walking time to the nearest {transport_type} stop'],
        aggfunc='mean')
    # Count the stops in each CLUE small area: flag every stop with 1, then sum the flags per area
    transport_gdf[f'n_{transport_type}_stops'] = 1
    transport_by_clue_small_area = summarize_within(
        input_gdf=transport_by_clue_small_area,
        input_summary_gdf=transport_gdf,
        in_fields = [f'n_{transport_type}_stops'],
        out_fields= [f'Total {transport_type} stops'],
        aggfunc='sum')
    # Areas without any stop get 0 (plain assignment instead of in-place fillna on a column selection)
    transport_by_clue_small_area[f'Total {transport_type} stops'] = transport_by_clue_small_area[f'Total {transport_type} stops'].fillna(0)
transport_by_clue_small_area = transport_by_clue_small_area.set_index('featurenam')
transport_by_clue_small_area = transport_by_clue_small_area.to_crs(pyproj.CRS.from_user_input('EPSG:4326'))
transport_by_clue_small_area.head(2)
| featurenam | shape_area | shape_len | geometry | Average number of bus stops within 300m of each property | Average distance from each property and bus stops in a 300m radius | Average distance between properties and the nearest bus stop | Average walking time to the nearest bus stop | Total bus stops | Average number of train stops within 300m of each property | Average distance from each property and train stops in a 300m radius | Average distance between properties and the nearest train stop | Average walking time to the nearest train stop | Total train stops | Average number of tram stops within 300m of each property | Average distance from each property and tram stops in a 300m radius | Average distance between properties and the nearest tram stop | Average walking time to the nearest tram stop | Total tram stops |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Parkville | 4050997.2362 | 9224.56939673 | MULTIPOLYGON (((144.94037 -37.78762, 144.94007... | 2.69 | 3154.26 | 353.82 | 6.35 | 27.0 | 0.07 | 3947.11 | 1156.19 | 21.21 | 1.0 | 2.41 | 3200.59 | 312.20 | 8.04 | 1.0 |
| Southbank | 1596010.33174 | 6012.37723915 | MULTIPOLYGON (((144.95600 -37.82310, 144.95808... | 5.07 | 890.63 | 143.56 | 3.16 | 29.0 | 0.04 | 969.37 | 890.29 | 15.72 | 0.0 | 4.03 | 938.20 | 157.87 | 4.31 | 0.0 |
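Once the metrics are aggregated per area, ranking the areas from worst to best is a simple sort. The sketch below uses only the average walking times (in minutes) of the two rows shown in the table above. The dictionary layout and the helper name `rank_by_walking_time` are assumptions made for illustration and are not part of the notebook.

```python
# Average walking time (min) to the nearest stop, copied from the two rows above
walking_time_min = {
    'Parkville': {'bus': 6.35, 'train': 21.21, 'tram': 8.04},
    'Southbank': {'bus': 3.16, 'train': 15.72, 'tram': 4.31},
}


def rank_by_walking_time(table, transport_type):
    """Return the area names sorted from the longest to the shortest average walk."""
    return sorted(table, key=lambda area: table[area][transport_type], reverse=True)


# One ranking per transport type, worst-performing area first
worst_first = {
    transport_type: rank_by_walking_time(walking_time_min, transport_type)
    for transport_type in ('bus', 'train', 'tram')
}
```

With the full table, the first entries of each ranking are the areas that stand out in the bar charts.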
The following charts and maps show which CLUE small areas have the most difficulty reaching transport stops.
The code below creates a function that plots the interactive bar charts.
Using the drop-down you can choose between the 'All', 'Bus', 'Train' and 'Tram' transport types.
def plot_bar(gdf, bus_col, train_col, tram_col, title):
    fig = go.Figure()
    # One bar trace per transport type; the order (bus, train, tram) must match the 'visible' lists below
    fig.add_trace(
        go.Bar(x = gdf.index, y=gdf[bus_col], name='Bus', marker = {'color':'#c49279'})
    )
    fig.add_trace(
        go.Bar(x = gdf.index, y=gdf[train_col], name='Train', marker = {'color':'#6f7ca8'})
    )
    fig.add_trace(
        go.Bar(x = gdf.index, y=gdf[tram_col], name='Tram', marker = {'color':'#78a27c'})
    )
    # Drop-down: each button toggles which traces are visible
    fig.update_layout(
        updatemenus=[go.layout.Updatemenu(
            active=0,
            buttons=list(
                [dict(label = 'All',
                      method = 'update',
                      args = [{'visible': [True, True, True]},
                              {'title': title,
                               'showlegend':True}]),
                 dict(label = 'Bus',
                      method = 'update',
                      args = [{'visible': [True, False, False]},
                              {'title': title,
                               'showlegend':True}]),
                 dict(label = 'Train',
                      method = 'update',
                      args = [{'visible': [False, True, False]}, # the index of True aligns with the indices of plot traces
                              {'title': title,
                               'showlegend':True}]),
                 dict(label = 'Tram',
                      method = 'update',
                      args = [{'visible': [False, False, True]}, # the index of True aligns with the indices of plot traces
                              {'title': title,
                               'showlegend':True}]),
                ])
            )
        ])
    fig.update_layout(title_text = title, title_x=0.5)
    return fig
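The four hand-written buttons follow one pattern: each 'visible' list holds one boolean per trace, in the order the traces were added to the figure. As a sketch of that pattern, the hypothetical helper `visibility_buttons` below (not part of the notebook) builds the same kind of button dictionaries for any list of trace names, using plain Python only.

```python
def visibility_buttons(trace_names, title):
    """Build an 'All' button plus one button per trace, as plain dicts.

    Each 'visible' list has one boolean per trace, in the order in which
    the traces were added to the figure.
    """
    buttons = [dict(label='All',
                    method='update',
                    args=[{'visible': [True] * len(trace_names)},
                          {'title': title, 'showlegend': True}])]
    for i, name in enumerate(trace_names):
        # Only the i-th trace stays visible for this button
        mask = [j == i for j in range(len(trace_names))]
        buttons.append(dict(label=name,
                            method='update',
                            args=[{'visible': mask},
                                  {'title': title, 'showlegend': True}]))
    return buttons


buttons = visibility_buttons(['Bus', 'Train', 'Tram'],
                             'Average walking time to the nearest stop')
```

The resulting list could be passed as `buttons=` to `go.layout.Updatemenu` in place of the literal list, which keeps the masks consistent if a trace is added or reordered.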
Bar chart of the average distance between properties and the closest stop (bus, train or tram) in each CLUE Small Area
fig = plot_bar(
transport_by_clue_small_area,
'Average distance between properties and the nearest bus stop',
'Average distance between properties and the nearest train stop',
'Average distance between properties and the nearest tram stop',
'Average distance to the closest stop in each region'
)
fig.show()
Bar chart of the average walking time between properties and the closest stop (bus, train or tram) on each CLUE Small Area
fig = plot_bar(
transport_by_clue_small_area,
'Average walking time to the nearest bus stop',
'Average walking time to the nearest train stop',
'Average walking time to the nearest tram stop',
'Average walking time to the nearest stop'
)
fig.show()
Bar chart of the average number of stops (bus, train or tram) within 300 meters of properties in each CLUE Small Area
fig = plot_bar(
transport_by_clue_small_area,
'Average number of bus stops within 300m of each property',
'Average number of train stops within 300m of each property',
'Average number of tram stops within 300m of each property',
'Average number of stops within 300m of each property'
)
fig.show()
The code below creates a function that plots the interactive maps.
Using the drop-down you can choose between the 'Bus', 'Train' and 'Tram' transport types.
def plot_map(gdf, bus_col, train_col, tram_col, title):
    # Start with the bus metric; the drop-down buttons below swap in the other columns
    fig = go.Figure(go.Choroplethmapbox(geojson=gdf.__geo_interface__, locations=gdf.index, z=gdf[bus_col],
                                        colorscale="YlOrRd", zmin=gdf[bus_col].min(), zmax=gdf[bus_col].max(),
                                        marker_opacity=0.9, marker_line_width=0))
    # x and y (longitudes and latitudes used to centre the map) are assumed to be defined earlier in the notebook
    fig.update_layout(mapbox_style="open-street-map", mapbox_center = {"lat": y[0], "lon": x[0]}, mapbox_zoom=11)
    matter_r= [[0.0, '#2f0f3d'], #cmocean colorscale
               [0.1, '#4f1552'],
               [0.2, '#72195f'],
               [0.3, '#931f63'],
               [0.4, '#b32e5e'],
               [0.5, '#cf4456'],
               [0.6, '#e26152'],
               [0.7, '#ee845d'],
               [0.8, '#f5a672'],
               [0.9, '#faca8f'],
               [1.0, '#fdedb0']]
    # Each button replaces z and rescales zmin/zmax to the range of the selected column
    button1 = dict(method= 'update',
                   label='Bus',
                   args=[
                       {"z": [gdf[bus_col]],
                        "zmax":[gdf[bus_col].max()],
                        "zmin":[gdf[bus_col].min()],
                       }, #dict for fig.data[0] updates
                       {"coloraxis.colorscale":"Viridis" } #dict for layout attribute update
                   ])
    button2 = dict(method= 'update',
                   label='Train',
                   args=[
                       {"z": [gdf[train_col]],
                        "zmax":[gdf[train_col].max()],
                        "zmin":[gdf[train_col].min()],
                       },
                       {"coloraxis.colorscale": matter_r} #update layout attribute
                   ])
    button3 = dict(method= 'update',
                   label='Tram',
                   args=[
                       {"z": [gdf[tram_col]],
                        "zmax":[gdf[tram_col].max()],
                        "zmin":[gdf[tram_col].min()]
                       },
                       {"coloraxis.colorscale": matter_r} #update layout attribute
                   ])
    fig.update_layout(updatemenus=[dict(active=0,
                                        buttons= [button1, button2, button3])]
                      )
    fig.update_layout(title_text = title, title_x=0.5)
    return fig
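Because every button resets `zmin` and `zmax` to the range of the selected column, the colours are relative to each transport type: the darkest and lightest areas are the extremes for that type only, so colours should not be compared across types. The sketch below illustrates the idea with a shortened three-stop scale and the train distances of the two areas shown earlier as the range. The helper names are hypothetical, and picking the nearest lower stop is a simplification, since plotly interpolates between stops.

```python
def normalise(z, zmin, zmax):
    """Map z linearly onto [0, 1], clamping values outside [zmin, zmax]."""
    if zmax == zmin:
        return 0.0
    return min(1.0, max(0.0, (z - zmin) / (zmax - zmin)))


def stop_colour(z, zmin, zmax, colorscale):
    """Return the colour of the last colorscale stop at or below the normalised z."""
    t = normalise(z, zmin, zmax)
    colour = colorscale[0][1]
    for position, candidate in colorscale:
        if position <= t:
            colour = candidate
    return colour


# Shortened version of the colorscale list format used above: [position, hex colour]
scale = [[0.0, '#2f0f3d'], [0.5, '#cf4456'], [1.0, '#fdedb0']]

# Range taken from the average distance to the nearest train stop (m) of the two areas shown earlier
zmin, zmax = 890.29, 1156.19
lowest = stop_colour(890.29, zmin, zmax, scale)
middle = stop_colour(1100.0, zmin, zmax, scale)
highest = stop_colour(1156.19, zmin, zmax, scale)
```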
Map of the average distance between properties and the closest stop (bus, train or tram) in each CLUE Small Area
fig = plot_map(
transport_by_clue_small_area,
'Average distance between properties and the nearest bus stop',
'Average distance between properties and the nearest train stop',
'Average distance between properties and the nearest tram stop',
'Average distance to the closest stop in each region'
)
fig.show()
Map of the average walking time between properties and the closest stop (bus, train or tram) on each CLUE Small Area
fig = plot_map(
transport_by_clue_small_area,
'Average walking time to the nearest bus stop',
'Average walking time to the nearest train stop',
'Average walking time to the nearest tram stop',
'Average walking time to the nearest stop'
)
fig.show()
Map of the average number of stops (bus, train or tram) within 300 meters of properties in each CLUE Small Area
fig = plot_map(
transport_by_clue_small_area,
'Average number of bus stops within 300m of each property',
'Average number of train stops within 300m of each property',
'Average number of tram stops within 300m of each property',
'Average number of stops within 300m of each property'
)
fig.show()
This analysis has provided a comprehensive starting point for inspecting the Melbourne Open Data Transport data and Dwellings dataset.
We achieved in this analysis:
We learned from this analysis:
As a preliminary view, the analysis shows that
As expected, the central area (Melbourne CBD) has by far the highest number of dwellings (about 28 thousand). The other districts with the highest numbers are Carlton, North Melbourne, Docklands and Kensington, in that order.
Looking at the forecasts for the next 20 years, most of the districts with a high number of dwellings today are projected to grow further by 2040.
Regarding transport accessibility in terms of walking time in each district:
Bus: The district with the most difficult access to bus transport is East Melbourne. On average, you have to walk for more than 12 minutes to get to a bus stop there. Other districts that don't perform well are Docklands, Parkville and Kensington.
Train: The district with the most difficult access to train transport is Docklands. On average, you have to walk for more than 25 minutes to get to a train stop there. Other districts that don't perform well are Melbourne (Remainder), Carlton and Parkville.
Tram: The district with the most difficult access to tram transport is Docklands. On average, you have to walk for more than 15 minutes to get to a tram stop there. Other districts that don't perform well are Melbourne (Remainder), Kensington and Southbank.
Thus, Docklands is in a delicate position, since it is among the worst-performing districts on all the transport access metrics used in this analysis. This is aggravated by the fact that it is among the districts with the highest number of dwellings today, and it is also the one with the highest projected dwelling growth.
When analyzing the correlation between dwellings and transport access across the city (using grid aggregation), we notice that there is actually a positive correlation between them. This is good, since it means that areas with more demand have more transport access. However, there is always room for improvement. Analysing the spatial variables described in this notebook along with others may help transport planners make more accurate decisions on where to allocate resources to better serve Melbourne citizens.
Future work: